Adding Dockerfile and scripts for building and starting #457
A PR for a first draft of Docker support, as discussed in #378.

A few notes:

What is important:

- `DOCKER_README.md` explains all the optional arguments to the scripts in detail. I recommend you read this one first.
- `docker` was introduced.
- `conda` is a subfolder with Docker files that build an `environment.yml` file. This file is used in all the other Docker builds to ensure that the environments are identical. It is also what ensures that DeepSpeed's environment is compatible with alltalk's environment.
- `deepspeed` is a subfolder to build DeepSpeed. It uses the conda environment file mentioned above.
- `versions.sh` lists important variables. This should make it even simpler to bump versions.
- `docker-build.sh` and `docker-start.sh` make it super simple to use. If you want to understand the whole magic, those 2 files are a good start.
- `docker-build.sh` internally makes sure that the conda environment and DeepSpeed are built, so there is no need to invoke them manually.

Less important, but still noteworthy:

- `pytorch-cuda`. However: